Classification using ENVI 5.2

Prof. P. Lewis & Dr. M. Disney

Remote Sensing Unit

Dept. Geography

UCL

Aims

After completing this practical, you should be able to analyse one or more image datasets using classification methods. This includes learning to identify land cover classes in a dataset, considering class separability (using histograms, scatterplots and other tools), and applying and assessing a classification product using ENVI.

Advanced use of these notes

It is perfectly adequate to simply view the html (webpage) version of these notes. However, because the notes are written as an ipython notebook, there are some additional features (in this case, a convolution tool with sliders) that you can use if you run them as a notebook.

To use the notes as a notebook (assuming you have git and python on your computer):

  1. Copy all of the notes to your local computer (if for the first time)

    mkdir -p ~/DATA/working
    cd ~/DATA/working
    
    git clone https://github.com/profLewis/geog2021.git
    
    cd geog2021
  2. Copy all of the notes to your local computer (if for an update)

    cd ~/DATA/working/geog2021
    
    git reset --hard HEAD
    
    git pull
  3. Run the notebook

    ipython notebook Classification.ipynb


Data

The datasets you need for this practical are available from:

You should download these data and put them in a directory (folder) that you will remember!

The data you will be using are:

  • six wavebands of a Landsat TM image over Rondonia, Brazil, imaged on 25th July 1992. The data are at an original pixel spacing of 28.5 m.

  • six wavebands (nominally the same wavelengths) of a Landsat ETM image with the same spatial resolution, covering the same spatial extent. These data were obtained on 11th August 2001.

  • Digital Elevation Model (DEM) data, obtained by RADAR interferometry from data on the SRTM (Shuttle Radar Topography Mission), are also available for the site. The data have been resampled to the same resolution and area as the TM/ETM data above.

The wavebands are:

Band    Wavelength (nm)
1       450-520
2       520-600
3       630-690
4       760-900
5       1550-1750
6       2080-2350

The extent of the imagery is (Lat/Lon):

\[ 11^\circ 1' 31.29'' \mathrm{S}, 62^\circ 58' 27.57'' \mathrm{W} \rightarrow 11^\circ 57' 4.75'' \mathrm{S}, 62^\circ 1' 55.96'' \mathrm{W} \]

The full SRTM data can be loaded into Google Earth, if you have access to this.

Although you have the data 'pre-packaged' for this practical, you can download your own datasets using the USGS GloVis tool:

We can of course explore the area in Google Maps, which we may find useful for exploring the classification.

In [28]:
# Don't worry about this -- it's just to display the Google Maps view
from IPython.display import HTML
HTML('<iframe src=gmRondonia.html width=100% height=350></iframe>')
Out[28]:

1. Introduction

In this section, we load the image data we wish to explore.

In [19]:
run python/video.py
In [20]:
video('images/rondonia_deforestation_medium.mp4', 'x-m4v')
Out[20]:
In [29]:
src = "https://earthengine.google.org/timelapse/player?c=https%3A%2F%2Fearthengine.google.org%2Ftimelapse%2Fdata&v=-10.22878,-63.05315,6.5&r=.5&p=true"
HTML('<iframe src=%s height=854 width=100%% frameborder="0"></iframe>'%src)
Out[29]:

Importantly, we have the ability to map these changes from the archive of satellite data, particularly data from the Landsat series of satellites. An excellent introduction to visualising environmental change from Landsat data is given by Jeffrey Kluger.

Using data such as these, we can 'track' the changes in land cover over time.

For example, below we show data produced by Google and Dr. Matthew Hansen at the University of Maryland: global maps of forest change (2000-2012) derived from Landsat data (see Science article). Forest loss is shown in pseudocolour, from yellow for loss in the year 2000 through to red for loss in 2012.

In [32]:
src="http://earthenginepartners.appspot.com/science-2013-global-forest?hl=en&llbox=-5.836%2C-15.772%2C-58.387%2C-68.198&t=ROADMAP&layers=layer0%2C6%2Clayer12%2Clayer9%3A100%2C1%3A100&embedded=true"
HTML('<iframe width="100%%" height=1200 src=%s style="border: 1px solid #ccc"></iframe>'%src)
Out[32]:

The purpose of this practical is for you to perform and test a land cover classification over this area, using data from two dates (1992 and 2001). The visualisations above show that there has been significant change since 2001 (and before 1992).

We will be doing this using separate classifications of the two image dates, but you should be thinking throughout about whether this is an appropriate method, and what else you might consider (especially if you had a long time series of data such as those shown in the animations).

We will be doing a supervised classification here.

The steps you will undertake are:

  • Examine the data and explore the spectral characteristics
  • Define a series of Regions of Interest (ROIs) describing the classes you wish to extract
  • Perform the classification
  • Test the result

First, obtain and then load the TM and ETM images of Rondonia noted above, along with the SRTM DEM file.

View the ETM image as a False Colour Composite (FCC).

You may need to edit the image file to associate the DEM data correctly. To do this, look under 'Raster Management' in the Toolbox, and edit the ENVI header (for the Landsat data). You should then edit the header attributes to associate the DEM with the image data.

2. Examination of the data

Load up the two images and examine the data. Try to identify the various classes you might like to obtain for this exercise, and decide how you can identify them. Examine feature space plots (scatter plots) to help you decide what may be feasible (and what may not). You may decide that transformations of the data (e.g. band ratios or Principal Components) might aid your ability (and the computer's ability) to discriminate between classes, but you should simply explore the data to start with.
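As one illustration of a band-ratio transformation, the sketch below computes a normalised difference vegetation index (NDVI) from the red band (band 3, 630-690 nm) and the near-infrared band (band 4, 760-900 nm) in the waveband list above. It assumes the two bands have already been read into NumPy arrays. The pixel values are made up for illustration, and this sits outside the ENVI workflow itself.

```python
import numpy as np

def ndvi(red, nir):
    """Normalised difference of NIR and red; 0 where both bands are zero."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = nir + red
    out = np.zeros_like(total)
    # Only divide where the denominator is non-zero
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Hypothetical digital numbers for a 2x2 patch (not taken from the practical data)
red = np.array([[30, 60], [25, 0]])
nir = np.array([[90, 60], [100, 0]])
index = ndvi(red, nir)
```

Dense vegetation tends to give high values (strong NIR reflectance, low red), so a transformed band like this may separate forest from cleared land more clearly than either band alone.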

In [38]:
# Don't worry about this -- it's just to display the Google Maps view
from IPython.display import HTML
HTML('<iframe src=gmRondoniaZoom.html width=100% height=700></iframe>')
Out[38]:

Some examples of the various classes you might consider (shown on a standard False Colour Composite (FCC) image):

  • Urban: May also include other 'built' structures such as roads. You should be able to recognise these from their spatial structure, even at this resolution.
  • Forest: This should be easy to spot, but there are sometimes clear 'shading' effects (as in this example) that might complicate classification.
  • Rocks: Rocks are quite easily identifiable in the FCC images. You would generally expect them to be static between the two dates.
  • Rivers: There are rivers and other water bodies in the scene, which you will be able to recognise by their shape. They will be difficult to use as training sites as they are quite narrow at this resolution.
  • Farmland: You will see a broad patchwork of areas that have been cleared of forest and used to graze cattle or raise crops. The areas are quite easy to spot in the FCC images, but might represent a broad spectral class because of the various physical cover types involved.
  • Other: You may spot some areas that have rather different spectral properties to most of the other areas. One example is shown here of field-shaped areas (green and purple areas) that might be inferred to be farmland, but are clearly different spectrally to other areas of farmland. We cannot really determine what these areas are from the information available, so you might require an 'other' class to cope with such eventualities.
  • Cloud: The images may contain a small amount of cloud or smoke/haze, an example of which is shown here. These are quite easy to recognise visually in the FCC, but may be difficult to classify unless they are quite thick. If there are any thick clouds, you may see cloud shadows on the ground as well.

You may make use of Google Maps to explore detail of the areas, e.g., if you zoom in to the 'rock' area, you will find it is actually more complex than just 'bare rock':

When deciding which classes may be appropriate to use, you should make use of your understanding of histograms and scatterplots, and use these to help explore the image information content.

3. Defining spectral classes

In order to classify the image data you are required to define a set of "signatures" which represent each class. These are then used to "train" the classification algorithm.

In ENVI, you need to define these classes via ROIs (Regions of Interest). Select the ROI tool:

and outline an ROI you want to define with the tool:

You may find the 'N-D visualiser' useful when doing this:

If you select only 2 bands to view, you will see information similar to the scatterplot (i.e. 2-dimensional).

In such a view, you can readily 'see' how separable the classes might be.

In higher dimensions, the visualiser 'rotates' the view so you can get different perspectives on the classes.

Note that you will want to create an ROI for each class you are interested in, but that you can 'merge' (or delete) classes once you have created them.

When you think you have a suitable set of ROIs, check the class separability:

This outputs Divergence metrics between the classes you have defined. These values range between 0 and 2.0. As a guide to interpretation, values greater than 1.9 indicate good separability of classes. If class separability is less than this, you might consider splitting the classes for the classification and recombining them post-classification (e.g. have two classes: forest1 and forest2).
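To make the 0 to 2.0 range concrete, the sketch below computes one widely used pair-separability measure, the Jeffries-Matusita (JM) distance, from two sets of training pixels. It assumes each class is roughly Gaussian in feature space. The class names and pixel values are synthetic, and the details of ENVI's own report may differ from this.

```python
import numpy as np

def jeffries_matusita(x1, x2):
    """JM distance between two classes, each an (n_pixels, n_bands) array."""
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    c1 = np.cov(x1, rowvar=False)
    c2 = np.cov(x2, rowvar=False)
    c = 0.5 * (c1 + c2)
    d = m1 - m2
    # Bhattacharyya distance under a Gaussian assumption for each class
    term1 = 0.125 * d @ np.linalg.solve(c, d)
    _, logdet_c = np.linalg.slogdet(c)
    _, logdet_1 = np.linalg.slogdet(c1)
    _, logdet_2 = np.linalg.slogdet(c2)
    term2 = 0.5 * (logdet_c - 0.5 * (logdet_1 + logdet_2))
    b = term1 + term2
    # JM saturates at 2 as the classes become fully separated
    return 2.0 * (1.0 - np.exp(-b))

# Synthetic two-band training samples (hypothetical values)
rng = np.random.default_rng(0)
forest = rng.normal([40, 90], 5, size=(200, 2))
pasture = rng.normal([70, 80], 5, size=(200, 2))
forest2 = rng.normal([41, 90], 5, size=(200, 2))

jm_far = jeffries_matusita(forest, pasture)   # well separated pair
jm_near = jeffries_matusita(forest, forest2)  # heavily overlapping pair
```

A pair such as forest and pasture here should score close to 2, while two near-identical samples score close to 0, which is the situation where merging the classes is worth considering.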

Then, make sure you save them (in XML format):

4. Image Classification

To perform a classification, first look at the options in the Toolbox:

As a first attempt, try the Maximum Likelihood classifier.

A Tutorial is available that will take you through some of the other options.

For the Maximum Likelihood classifier, select this item from the Toolbox:

and perform any subsetting or masking that you might require.

Then, select the classes you want from the ROIs you have defined, and decide whether you want to save the result. If not, just send it to 'memory', but it will then be lost at the end of the session. If you do save the result, make sure you note down (in your notebook) what the file name was and what settings you used (e.g. which classes).

You should now have a classification result:

It is generally very instructive to visualise the 'rule' image associated with a result. This provides you with the reasoning the computer used to obtain the result it did.

For a method such as that used above, the training data are used to generate multivariate statistical distributions that we suppose to describe the full class. Each pixel can then be assigned a probability of membership for each class. The pixel is usually given the label of the class for which its membership probability is highest.
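The idea can be sketched in a few lines of Python. This is a minimal illustration assuming Gaussian class distributions and equal prior probabilities. The function names and the synthetic training data are hypothetical, and ENVI's implementation may differ in detail. The `rule` array plays a role similar to the rule image: one column of log-likelihoods per class.

```python
import numpy as np

def train(samples):
    """Estimate (mean, covariance) per class from {name: (n, bands) array}."""
    return {name: (x.mean(axis=0), np.cov(x, rowvar=False))
            for name, x in samples.items()}

def log_likelihood(pixels, mean, cov):
    """Gaussian log-likelihood of each pixel (one per row) for one class."""
    d = pixels - mean
    maha = np.einsum('ij,ij->i', d @ np.linalg.inv(cov), d)
    _, logdet = np.linalg.slogdet(cov)
    k = pixels.shape[1]
    return -0.5 * (maha + logdet + k * np.log(2 * np.pi))

def classify(pixels, stats):
    """Label each pixel with the class of highest log-likelihood."""
    names = list(stats)
    rule = np.column_stack([log_likelihood(pixels, *stats[n]) for n in names])
    labels = np.array(names)[rule.argmax(axis=1)]
    return labels, rule

# Synthetic two-band training data (hypothetical values)
rng = np.random.default_rng(1)
training = {
    'forest': rng.normal([40, 90], 5, size=(100, 2)),
    'pasture': rng.normal([70, 80], 5, size=(100, 2)),
}
stats = train(training)
pixels = np.array([[42.0, 88.0], [68.0, 81.0]])
labels, rule = classify(pixels, stats)
```

Comparing the columns of `rule` for a given pixel shows how decisive the labelling was: when two values are nearly equal, the assigned class is only marginally preferred.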

What issues might occur if the probability of belonging to more than one class is very similar?

There appear to be topographic effects in the class probability images. Why would that be so, and what might you do about it?

5. Accuracy Assessment

It is not very difficult to produce a classified map using earth observation data. You have now been through the process of supervised classification (using one method).

How can we tell how good this is though?

One thing you may wish to do is to examine the post-classification class statistics:

There are various other options that you may find useful to explore in the Post Classification section of the toolbox.

A vital part of the classification process though is an assessment of classification accuracy.

This is generally done as a confusion matrix.

In setting this up, you need either to have a ground truth 'image', or a set of ROIs that can be used for ground truth.

You should first generate a new (independent) set of ROIs (or better still, use random samples) for your classes. If you use random samples, you can check what you think the land cover class should be using Google Earth/Maps as above.

Once you have your confusion matrix, make sure that you understand what it is telling you (and as far as possible, why that is so).

If the classification result seems poor, you can go back and edit your settings or class definitions and re-try, but try to keep the ROIs you use for checking independent of this process.

Make sure you understand the terms we use to describe the different accuracies shown in the confusion matrix, and also what a kappa coefficient is.
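As a worked example of these quantities, the sketch below builds a confusion matrix from a handful of made-up reference and classified labels and derives the overall accuracy, the producer's and user's accuracies, and the kappa coefficient. The orientation (rows for the classified label, columns for the reference label) is an assumption made here, so check which way round your own report is laid out.

```python
import numpy as np

def confusion_matrix(reference, predicted, n_classes):
    """Rows: classified label, columns: reference (ground truth) label."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for r, p in zip(reference, predicted):
        cm[p, r] += 1
    return cm

def accuracy_measures(cm):
    total = cm.sum()
    overall = np.trace(cm) / total
    # Producer's accuracy: fraction of each reference class correctly mapped
    producers = np.diag(cm) / cm.sum(axis=0)
    # User's accuracy: fraction of each mapped class that is correct
    users = np.diag(cm) / cm.sum(axis=1)
    # Agreement expected by chance, from the row and column totals
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa

# Hypothetical labels for 10 check pixels (0, 1, 2 are class indices)
reference = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
predicted = [0, 0, 0, 1, 1, 1, 0, 2, 2, 2]
cm = confusion_matrix(reference, predicted, 3)
overall, producers, users, kappa = accuracy_measures(cm)
```

Here 8 of the 10 pixels agree, giving an overall accuracy of 0.8, but kappa is lower (about 0.70) because some of that agreement would be expected by chance. A class with no reference or no classified pixels would give a division by zero in this sketch.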

6. Further Work

In this practical, you have gone through the process of performing an image classification and assessing its accuracy.

To finish the practical, you should classify both of the Landsat datasets you have, and calculate the change in forest area between the two dates. Since you have an accuracy assessment, it should be feasible for you to put an uncertainty on that estimate of change.
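The area calculation itself is simple pixel counting, sketched below under the assumption that the two classified maps are available as NumPy arrays of integer class labels on the same grid. The label value for forest and the tiny example maps are hypothetical. The 28.5 m pixel spacing is the one stated for the data above.

```python
import numpy as np

PIXEL_SIZE_M = 28.5  # pixel spacing stated for the TM/ETM data

def class_area_km2(class_map, class_id, pixel_size_m=PIXEL_SIZE_M):
    """Area covered by one class, from a count of its pixels."""
    n_pixels = np.count_nonzero(class_map == class_id)
    return n_pixels * pixel_size_m**2 / 1e6

FOREST = 1  # hypothetical label value for the forest class

# Tiny made-up classified maps for the two dates
map_1992 = np.array([[1, 1, 1], [1, 1, 0], [1, 0, 0]])
map_2001 = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 0]])

area_1992 = class_area_km2(map_1992, FOREST)
area_2001 = class_area_km2(map_2001, FOREST)
change = area_2001 - area_1992  # negative values indicate forest loss
```

A raw pixel count like this takes the classification at face value. The user's and producer's accuracies for the forest class on each date indicate how far the counts might be over- or under-estimated, which is one starting point for the uncertainty on the change.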

7. Summary

The main aim of this practical is to reinforce your understanding of the classification process and to give you practical experience of it.

It would be worthwhile exploring some of the options you have available (e.g. try some different classifiers).

Since there is quite a lot of 'button clicking' in this exercise, make sure that you understand what you are doing and why you are getting the result you do -- there is very little value in the exercise otherwise!

If you have questions, ask the staff!

If you are very interested in change detection, you could explore the change detection options in ENVI.